DNN Speech Recognizer
Files Submitted
Criteria | Meets Specification |
---|---|
Submission Files | The submission includes all required files. |
STEP 2: Model 0: RNN
Criteria | Meets Specification |
---|---|
Trained Model 0 | The submission trained the model for at least 20 epochs, and none of the loss values in |
STEP 2: Model 1: RNN + TimeDistributed Dense
Criteria | Meets Specification |
---|---|
Completed | The submission includes a |
Trained Model 1 | The submission trained the model for at least 20 epochs, and none of the loss values in |
STEP 2: Model 2: CNN + RNN + TimeDistributed Dense
Criteria | Meets Specification |
---|---|
Completed | The submission includes a |
Trained Model 2 | The submission trained the model for at least 20 epochs, and none of the loss values in |
STEP 2: Model 3: Deeper RNN + TimeDistributed Dense
Criteria | Meets Specification |
---|---|
Completed | The submission includes a |
Trained Model 3 | The submission trained the model for at least 20 epochs, and none of the loss values in |
STEP 2: Model 4: Bidirectional RNN + TimeDistributed Dense
Criteria | Meets Specification |
---|---|
Completed | The submission includes a |
Trained Model 4 | The submission trained the model for at least 20 epochs, and none of the loss values in |
STEP 2: Compare the Models
Criteria | Meets Specification |
---|---|
Question 1 | The submission includes a detailed analysis of why different models might perform better than others. |
STEP 2: Final Model
Criteria | Meets Specification |
---|---|
Trained Final Model | The submission trained the model for at least 20 epochs, and none of the loss values in |
Completed | The submission includes a |
Question 2 | The submission includes a detailed description of how the final model architecture was designed. |
Tips to make your project stand out:
(1) Add a Language Model to the Decoder
The performance of the decoding step can be greatly enhanced by incorporating a language model. Build your own language model from scratch, or leverage a repository or toolkit that you find online to improve your predictions.
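As one possible starting point, the sketch below rescores candidate transcriptions with a small bigram language model. It is an illustration under stated assumptions, not the project's decoder: the function names (`train_bigram_lm`, `log_prob`, `rescore`), the add-one smoothing, the `lm_weight` parameter, the toy corpus, and the acoustic scores are all hypothetical. It also assumes the acoustic model can supply several candidate transcriptions, each with a log-probability.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count context words and bigrams over whitespace-tokenised sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        tokens = ["<s>"] + words + ["</s>"]
        vocab.update(words)
        contexts.update(tokens[:-1])                  # every token that precedes another
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, len(vocab)


def log_prob(sentence, lm):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    contexts, bigrams, vocab_size = lm
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total


def rescore(candidates, lm, lm_weight=0.5):
    """Pick the text maximising acoustic score + lm_weight * language-model score.

    candidates: list of (text, acoustic_log_prob) pairs (assumed input format).
    """
    return max(candidates, key=lambda c: c[1] + lm_weight * log_prob(c[0], lm))[0]


# Toy data, invented for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the rug", "a cat ran"]
lm = train_bigram_lm(corpus)
candidates = [
    ("the cat sat on the mat", -4.2),  # slightly worse acoustic score
    ("the cat sad on the mat", -4.0),  # acoustically preferred, but unlikely text
]
best = rescore(candidates, lm)
print(best)
```

In this toy setup the language model score outweighs the small acoustic advantage of the misspelled candidate. A larger corpus, a higher-order model, or an existing toolkit would be needed for real gains, and `lm_weight` would normally be tuned on held-out data.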
(2) Train on Bigger Data
In the project, you used some of the smaller downloads from the LibriSpeech corpus. Try training your model on a larger dataset: instead of using dev-clean.tar.gz, download one of the larger training sets on the website.
(3) Try out Different Audio Features
In this project, you had the choice to use either spectrogram or MFCC features. Take the time to test the performance of both of these features. For a special challenge, train a network that uses raw audio waveforms!
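To make the difference between raw waveforms and spectrogram features concrete, here is a minimal sketch of a magnitude spectrogram computed from raw samples. It is not the project's feature extractor: the function name `spectrogram`, the frame length, the hop size, and the Hann window are assumptions chosen for illustration. It uses only the standard library with a direct DFT so that it runs anywhere; a real pipeline would use an FFT from a numerical library.

```python
import cmath
import math


def spectrogram(samples, frame_len=16, hop=8):
    """Return one list of magnitudes (bins 0..frame_len//2) per overlapping frame."""
    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


# A synthetic tone with exactly 4 cycles per 16-sample frame (toy input).
tone = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
frames = spectrogram(tone)
peak_bin = max(range(len(frames[0])), key=frames[0].__getitem__)
print(len(frames), len(frames[0]), peak_bin)
```

Each row of the result is one time step and each column one frequency bin, which is the shape a recurrent or convolutional acoustic model would consume. MFCC features add a mel filterbank, a logarithm, and a discrete cosine transform on top of this step.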